Overview

Dataset statistics

Number of variables: 13
Number of observations: 44444
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.4 MiB
Average record size in memory: 104.0 B
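
The memory figures are consistent with every cell costing 8 bytes (a 64-bit number or an object pointer). That is an assumption about how the size was measured, not something the report states. A quick check under that assumption:

```python
# Assumption: each cell is counted as 8 bytes (int64/float64 value or object pointer).
N_VARIABLES = 13
N_OBSERVATIONS = 44444
BYTES_PER_CELL = 8  # assumed, not stated in the report

record_size = N_VARIABLES * BYTES_PER_CELL         # bytes per row
total_mib = record_size * N_OBSERVATIONS / 2**20   # size of the whole table in MiB

print(f"record size: {record_size} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under this assumption the per-row size matches the reported 104.0 B and the total rounds to the reported 4.4 MiB.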

Variable types

Numeric: 8
Categorical: 5

Alerts

imp_hash has constant value "fbcff5951ad0c204f4744c629548c6c6" (Constant)
filename has a high cardinality: 383 distinct values (High cardinality)
sha256 has a high cardinality: 872 distinct values (High cardinality)
sec_md5 has a high cardinality: 176 distinct values (High cardinality)
sec_name has a high cardinality: 1495 distinct values (High cardinality)
df_index is highly correlated with Unnamed: 0 and 1 other field (High correlation)
Unnamed: 0 is highly correlated with df_index and 1 other field (High correlation)
win_count is highly correlated with df_index and 1 other field (High correlation)
sec_chi2 is highly correlated with sec_entropy (High correlation)
sec_entropy is highly correlated with sec_chi2 and 1 other field (High correlation)
raw_size is highly correlated with virtual_size (High correlation)
virtual_size is highly correlated with raw_size (High correlation)
virtual_address is highly correlated with sec_entropy (High correlation)
sec_entropy is highly correlated with raw_size and 2 other fields (High correlation)
raw_size is highly correlated with sec_entropy and 1 other field (High correlation)
virtual_size is highly correlated with sec_entropy and 1 other field (High correlation)
sec_chi2 is highly correlated with raw_size and 1 other field (High correlation)
raw_size is highly correlated with sec_chi2 and 3 other fields (High correlation)
virtual_size is highly correlated with sec_chi2 and 3 other fields (High correlation)
virtual_address is highly correlated with sec_entropy and 2 other fields (High correlation)
df_index has unique values (Unique)
Unnamed: 0 has unique values (Unique)
sec_entropy has 37863 (85.2%) zeros (Zeros)
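
As a rough illustration of what the Constant alert and the df_index / Unnamed: 0 alerts point at, the sketch below flags constant columns and columns that are exact copies of an earlier column. It uses plain Python on a tiny made-up table. The function names and sample rows are hypothetical, and this is not how pandas-profiling itself computes its alerts.

```python
from typing import Dict, List, Tuple

def constant_columns(table: Dict[str, list]) -> List[str]:
    """Return the names of columns whose values are all identical."""
    return [name for name, values in table.items() if len(set(values)) == 1]

def duplicated_columns(table: Dict[str, list]) -> List[Tuple[str, str]]:
    """Return (first, copy) pairs where 'copy' repeats an earlier column exactly."""
    names = list(table)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if table[first] == table[second]:
                pairs.append((first, second))
    return pairs

# Made-up rows shaped like the profiled data (values are illustrative only).
sample = {
    "df_index": [5245, 5246, 5247],
    "Unnamed: 0": [5245, 5246, 5247],
    "imp_hash": ["fbcff5951ad0c204f4744c629548c6c6"] * 3,
    "raw_size": [4096, 8192, 4096],
}

print(constant_columns(sample))
print(duplicated_columns(sample))
```

Columns flagged this way (here imp_hash, and one of df_index / Unnamed: 0) carry no extra information and are natural candidates to drop before modelling.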

Reproduction

Analysis started: 2022-09-05 02:05:35.414278
Analysis finished: 2022-09-05 02:05:43.411799
Duration: 8 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 44444
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2755043.742
Minimum: 5245
Maximum: 5555048
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 5245
5-th percentile: 702247.15
Q1: 1021131.75
Median: 2007641.5
Q3: 4764338.25
95-th percentile: 5057719.85
Maximum: 5555048
Range: 5549803
Interquartile range (IQR): 3743206.5

Descriptive statistics

Standard deviation: 1751777.652
Coefficient of variation (CV): 0.6358438616
Kurtosis: -1.718630857
Mean: 2755043.742
Median Absolute Deviation (MAD): 1121038
Skewness: 0.1705083425
Sum: 1.224451641 × 10^11
Variance: 3.068724942 × 10^12
Monotonicity: Strictly increasing
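
Several of these figures can be re-derived from one another, which is a useful sanity check when reading the report. The sketch below uses only numbers copied from the df_index summary; the variable names are illustrative.

```python
# Values copied from the df_index summary.
minimum, maximum = 5245, 5555048
q1, q3 = 1021131.75, 4764338.25
mean, std = 2755043.742, 1751777.652
n = 44444  # number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr)
print(f"CV={cv:.6f} variance={variance:.6e} sum={total:.6e}")
```

The recomputed range, IQR and CV agree with the reported values, and the variance and sum come out on the order of 10^12 and 10^11 respectively.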
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
5245                    1       < 0.1%
4546072                 1       < 0.1%
4541475                 1       < 0.1%
4541476                 1       < 0.1%
4541477                 1       < 0.1%
4541478                 1       < 0.1%
4541479                 1       < 0.1%
4546069                 1       < 0.1%
4546070                 1       < 0.1%
4546071                 1       < 0.1%
Other values (44434)    44434   > 99.9%

Value                   Count   Frequency (%)
5245                    1       < 0.1%
5246                    1       < 0.1%
5247                    1       < 0.1%
5248                    1       < 0.1%
5249                    1       < 0.1%
5250                    1       < 0.1%
5251                    1       < 0.1%
5252                    1       < 0.1%
5253                    1       < 0.1%
5254                    1       < 0.1%

Value                   Count   Frequency (%)
5555048                 1       < 0.1%
5555047                 1       < 0.1%
5555046                 1       < 0.1%
5555045                 1       < 0.1%
5555044                 1       < 0.1%
5555043                 1       < 0.1%
5555042                 1       < 0.1%
5555041                 1       < 0.1%
5555040                 1       < 0.1%
5555039                 1       < 0.1%

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 44444
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2755043.742
Minimum: 5245
Maximum: 5555048
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 5245
5-th percentile: 702247.15
Q1: 1021131.75
Median: 2007641.5
Q3: 4764338.25
95-th percentile: 5057719.85
Maximum: 5555048
Range: 5549803
Interquartile range (IQR): 3743206.5

Descriptive statistics

Standard deviation: 1751777.652
Coefficient of variation (CV): 0.6358438616
Kurtosis: -1.718630857
Mean: 2755043.742
Median Absolute Deviation (MAD): 1121038
Skewness: 0.1705083425
Sum: 1.224451641 × 10^11
Variance: 3.068724942 × 10^12
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
5245                    1       < 0.1%
4546072                 1       < 0.1%
4541475                 1       < 0.1%
4541476                 1       < 0.1%
4541477                 1       < 0.1%
4541478                 1       < 0.1%
4541479                 1       < 0.1%
4546069                 1       < 0.1%
4546070                 1       < 0.1%
4546071                 1       < 0.1%
Other values (44434)    44434   > 99.9%

Value                   Count   Frequency (%)
5245                    1       < 0.1%
5246                    1       < 0.1%
5247                    1       < 0.1%
5248                    1       < 0.1%
5249                    1       < 0.1%
5250                    1       < 0.1%
5251                    1       < 0.1%
5252                    1       < 0.1%
5253                    1       < 0.1%
5254                    1       < 0.1%

Value                   Count   Frequency (%)
5555048                 1       < 0.1%
5555047                 1       < 0.1%
5555046                 1       < 0.1%
5555045                 1       < 0.1%
5555044                 1       < 0.1%
5555043                 1       < 0.1%
5555042                 1       < 0.1%
5555041                 1       < 0.1%
5555040                 1       < 0.1%
5555039                 1       < 0.1%

filename
Categorical

HIGH CARDINALITY

Distinct: 383
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 347.3 KiB

Value                       Count
2022041900/2022041900_47    850
2022041900/2022041900_11    550
2022041900/2022041900_40    521
2022041900/2022041900_12    500
2022041900/2022041900_32    500
Other values (378)          41523

Length

Max length: 33
Median length: 24
Mean length: 24.2697552
Min length: 23

Characters and Unicode

Total characters: 1078645
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20220329/2022032900/2022032900_10
2nd row: 20220329/2022032900/2022032900_10
3rd row: 20220329/2022032900/2022032900_10
4th row: 20220329/2022032900/2022032900_10
5th row: 20220329/2022032900/2022032900_10

Common Values

Value                       Count   Frequency (%)
2022041900/2022041900_47    850     1.9%
2022041900/2022041900_11    550     1.2%
2022041900/2022041900_40    521     1.2%
2022041900/2022041900_12    500     1.1%
2022041900/2022041900_32    500     1.1%
2022041900/2022041900_56    500     1.1%
2022041900/2022041900_55    450     1.0%
2022041901/2022041901_1     450     1.0%
2022041921/2022041921_52    445     1.0%
2022041922/2022041922_1     437     1.0%
Other values (373)          39241   88.3%

Length

Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
2338022
31.3%
0267358
24.8%
1128205
 
11.9%
498327
 
9.1%
995988
 
8.9%
/46795
 
4.3%
_44444
 
4.1%
324424
 
2.3%
515230
 
1.4%
78475
 
0.8%
Other values (2)11377
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number987406
91.5%
Other Punctuation46795
 
4.3%
Connector Punctuation44444
 
4.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2338022
34.2%
0267358
27.1%
1128205
 
13.0%
498327
 
10.0%
995988
 
9.7%
324424
 
2.5%
515230
 
1.5%
78475
 
0.9%
66107
 
0.6%
85270
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/46795
100.0%
Connector Punctuation
ValueCountFrequency (%)
_44444
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1078645
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2338022
31.3%
0267358
24.8%
1128205
 
11.9%
498327
 
9.1%
995988
 
8.9%
/46795
 
4.3%
_44444
 
4.1%
324424
 
2.3%
515230
 
1.4%
78475
 
0.8%
Other values (2)11377
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1078645
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2338022
31.3%
0267358
24.8%
1128205
 
11.9%
498327
 
9.1%
995988
 
8.9%
/46795
 
4.3%
_44444
 
4.1%
324424
 
2.3%
515230
 
1.4%
78475
 
0.8%
Other values (2)11377
 
1.1%

win_count
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 917
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 303154.1374
Minimum: 991
Maximum: 591567
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 991
5-th percentile: 97004
Q1: 179444
Median: 279558
Q3: 453279
95-th percentile: 499902
Maximum: 591567
Range: 590576
Interquartile range (IQR): 273835

Descriptive statistics

Standard deviation: 141769.3064
Coefficient of variation (CV): 0.4676476054
Kurtosis: -1.270612918
Mean: 303154.1374
Median Absolute Deviation (MAD): 110089
Skewness: -0.01351828215
Sum: 1.347338248 × 10^10
Variance: 2.009853624 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
379482                50      0.1%
297959                50      0.1%
255606                50      0.1%
268503                50      0.1%
268519                50      0.1%
269540                50      0.1%
269552                50      0.1%
269969                50      0.1%
278964                50      0.1%
279072                50      0.1%
Other values (907)    43944   98.9%

Value                 Count   Frequency (%)
991                   20      < 0.1%
1608                  48      0.1%
2018                  50      0.1%
2218                  50      0.1%
3896                  50      0.1%
4222                  50      0.1%
4232                  50      0.1%
5000                  50      0.1%
5268                  50      0.1%
5481                  50      0.1%

Value                 Count   Frequency (%)
591567                50      0.1%
587420                48      0.1%
575750                50      0.1%
543451                50      0.1%
537175                50      0.1%
528031                50      0.1%
525450                45      0.1%
524104                33      0.1%
519105                50      0.1%
518853                50      0.1%

sha256
Categorical

HIGH CARDINALITY

Distinct: 872
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 347.3 KiB

Value                                                               Count
0ad188c147c18398d5ee2e54be5ce6a3c196ed247a352851e22e51d3a9d92c7a    150
b2fe2aa2a8809247417e0f33c35e0c808682ad46b5f6488987787c6c0052fd87    100
b12699ae963bb426fbdcf4b69f08caec6c32213bf6a1a7b9201c08f908c32471    100
3db5f4cc1671c074697105754869e2ee80a830de3b41a6a5364415850d98ab87    100
79ef584a41008e2a42b6c48fa4a95eed727a0aa6a932d9dbf0f0d29f5c509c0b    100
Other values (867)                                                  43894

Length

Max length: 64
Median length: 64
Mean length: 64
Min length: 64

Characters and Unicode

Total characters: 2844416
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9
2nd row: 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9
3rd row: 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9
4th row: 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9
5th row: 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9

Common Values

Value                                                               Count   Frequency (%)
0ad188c147c18398d5ee2e54be5ce6a3c196ed247a352851e22e51d3a9d92c7a    150     0.3%
b2fe2aa2a8809247417e0f33c35e0c808682ad46b5f6488987787c6c0052fd87    100     0.2%
b12699ae963bb426fbdcf4b69f08caec6c32213bf6a1a7b9201c08f908c32471    100     0.2%
3db5f4cc1671c074697105754869e2ee80a830de3b41a6a5364415850d98ab87    100     0.2%
79ef584a41008e2a42b6c48fa4a95eed727a0aa6a932d9dbf0f0d29f5c509c0b    100     0.2%
11e6868502b491a93475f42a8f5f3bf1cfd635820df750530618cf1849f1307d    100     0.2%
7f6a104a98d7400c2adf22aa5407ac7a8854342ca1ba34512b8890c5a4201ad1    100     0.2%
1fccb0fd26e47b82cb0522a39873711ea023e2dbf66e3c6e0435d267454485e7    100     0.2%
d0e155e69192557cb2aa2095d814fa80cbfb1ce82a75ef611257f464ec771768    100     0.2%
7684f8426a58c1f902e13e25c68b89a3f06a5418370a3f2aae2ba3cc3f62ebae    100     0.2%
Other values (862)                                                  43394   97.6%

Length

Histogram of lengths of the category
Value                                                               Count   Frequency (%)
0ad188c147c18398d5ee2e54be5ce6a3c196ed247a352851e22e51d3a9d92c7a    150     0.3%
27a3dd3765f237db5971b5d1a599ade0f2aafd20691bd0bb89b5d28e0009ab5a    100     0.2%
a80c2a2705371649dfcc32469b8fd69004a28d14847c7cbc2e294a38bdb5499d    100     0.2%
312e6afc01e161f899ef512391d4286d0dbfdd792b237cc9323de7bbf28a08c9    100     0.2%
ea2c57449ea90302b27b94e28702e8e6710196ab461c630e92c3c715930bde82    100     0.2%
41422355eb3048b27e36b590dcd690ba74b5e776099e9abdcdb56c6a665877c4    100     0.2%
a0127c5d5f8788e78fbf73149e2b529bf2e7600010f51dd7b502e4228b4f6765    100     0.2%
15ef164c73d7770d6b96adf1a1cd8274dd3b53f26bb7fb10aee4f3a8b02fc264    100     0.2%
7421988e586066bc9539e680a9622d4d084bbd71182eb32f24af29c2659cd374    100     0.2%
152b4f077b594ad4fb17282643b4a30e91e41b98f19e4fde047877e5a9dda473    100     0.2%
Other values (862)                                                  43394   97.6%

Most occurring characters

ValueCountFrequency (%)
2186358
 
6.6%
9183320
 
6.4%
f180164
 
6.3%
3180065
 
6.3%
0179588
 
6.3%
8178593
 
6.3%
5178294
 
6.3%
a177140
 
6.2%
6176776
 
6.2%
d176279
 
6.2%
Other values (6)1047839
36.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1785270
62.8%
Lowercase Letter1059146
37.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2186358
10.4%
9183320
10.3%
3180065
10.1%
0179588
10.1%
8178593
10.0%
5178294
10.0%
6176776
9.9%
7175823
9.8%
4174795
9.8%
1171658
9.6%
Lowercase Letter
ValueCountFrequency (%)
f180164
17.0%
a177140
16.7%
d176279
16.6%
e175707
16.6%
c175081
16.5%
b174775
16.5%

Most occurring scripts

ValueCountFrequency (%)
Common1785270
62.8%
Latin1059146
37.2%

Most frequent character per script

Common
ValueCountFrequency (%)
2186358
10.4%
9183320
10.3%
3180065
10.1%
0179588
10.1%
8178593
10.0%
5178294
10.0%
6176776
9.9%
7175823
9.8%
4174795
9.8%
1171658
9.6%
Latin
ValueCountFrequency (%)
f180164
17.0%
a177140
16.7%
d176279
16.6%
e175707
16.6%
c175081
16.5%
b174775
16.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII2844416
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2186358
 
6.6%
9183320
 
6.4%
f180164
 
6.3%
3180065
 
6.3%
0179588
 
6.3%
8178593
 
6.3%
5178294
 
6.3%
a177140
 
6.2%
6176776
 
6.2%
d176279
 
6.2%
Other values (6)1047839
36.8%

imp_hash
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 347.3 KiB

Value                               Count
fbcff5951ad0c204f4744c629548c6c6    44444

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 1422208
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fbcff5951ad0c204f4744c629548c6c6
2nd row: fbcff5951ad0c204f4744c629548c6c6
3rd row: fbcff5951ad0c204f4744c629548c6c6
4th row: fbcff5951ad0c204f4744c629548c6c6
5th row: fbcff5951ad0c204f4744c629548c6c6

Common Values

Value                               Count   Frequency (%)
fbcff5951ad0c204f4744c629548c6c6    44444   100.0%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

ValueCountFrequency (%)
c222220
15.6%
4222220
15.6%
f177776
12.5%
5133332
9.4%
6133332
9.4%
988888
 
6.2%
088888
 
6.2%
288888
 
6.2%
b44444
 
3.1%
144444
 
3.1%
Other values (4)177776
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number888880
62.5%
Lowercase Letter533328
37.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4222220
25.0%
5133332
15.0%
6133332
15.0%
988888
 
10.0%
088888
 
10.0%
288888
 
10.0%
144444
 
5.0%
744444
 
5.0%
844444
 
5.0%
Lowercase Letter
ValueCountFrequency (%)
c222220
41.7%
f177776
33.3%
b44444
 
8.3%
a44444
 
8.3%
d44444
 
8.3%

Most occurring scripts

ValueCountFrequency (%)
Common888880
62.5%
Latin533328
37.5%

Most frequent character per script

Common
ValueCountFrequency (%)
4222220
25.0%
5133332
15.0%
6133332
15.0%
988888
 
10.0%
088888
 
10.0%
288888
 
10.0%
144444
 
5.0%
744444
 
5.0%
844444
 
5.0%
Latin
ValueCountFrequency (%)
c222220
41.7%
f177776
33.3%
b44444
 
8.3%
a44444
 
8.3%
d44444
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1422208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c222220
15.6%
4222220
15.6%
f177776
12.5%
5133332
9.4%
6133332
9.4%
988888
 
6.2%
088888
 
6.2%
288888
 
6.2%
b44444
 
3.1%
144444
 
3.1%
Other values (4)177776
12.5%

sec_chi2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 175
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2203016.473
Minimum: 57874.69
Maximum: 73113600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 57874.69
5-th percentile: 358466.16
Q1: 1044480
Median: 1044480
Q3: 2088960
95-th percentile: 7311360
Maximum: 73113600
Range: 73055725.31
Interquartile range (IQR): 1044480

Descriptive statistics

Standard deviation: 6622403.846
Coefficient of variation (CV): 3.006061882
Kurtosis: 103.8191048
Mean: 2203016.473
Median Absolute Deviation (MAD): 0
Skewness: 9.999100205
Sum: 9.791086414 × 10^10
Variance: 4.385623269 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1044480               26991   60.7%
2088960               7405    16.7%
7311360               3106    7.0%
358466.16             917     2.1%
2262192.25            917     2.1%
962731.13             917     2.1%
66544.3               917     2.1%
842616.38             917     2.1%
57874.69              917     2.1%
951366.5              565     1.3%
Other values (165)    875     2.0%

Value                 Count   Frequency (%)
57874.69              917     2.1%
66544.3               917     2.1%
105580.31             1       < 0.1%
106287.13             1       < 0.1%
109876.63             1       < 0.1%
109919.63             1       < 0.1%
113882.25             1       < 0.1%
113937.75             1       < 0.1%
114061.88             1       < 0.1%
114343.63             1       < 0.1%

Value                 Count   Frequency (%)
73113600              361     0.8%
7311360               3106    7.0%
2262192.25            917     2.1%
2248354               1       < 0.1%
2248330.5             1       < 0.1%
2248190.5             1       < 0.1%
2247969.5             1       < 0.1%
2246972.5             1       < 0.1%
2088960               7405    16.7%
2063693.38            1       < 0.1%
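
The report does not define sec_chi2. The values are consistent with a chi-square statistic of a section's byte histogram measured against a uniform distribution over the 256 byte values, but that reading is an inference, not something stated here. Under that assumption a section of N identical bytes scores 255·N, which reproduces the most common values 1044480 (N = 4096) and 2088960 (N = 8192). A minimal sketch, with a hypothetical function name:

```python
from collections import Counter

def byte_chi2(data: bytes) -> float:
    """Chi-square of the byte histogram against a uniform distribution (assumed definition)."""
    if not data:
        return 0.0
    expected = len(data) / 256          # expected count per byte value
    counts = Counter(data)              # observed count per byte value
    # Sum over all 256 possible byte values, including those that never occur.
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))

print(byte_chi2(bytes(4096)))   # 4096 zero bytes
print(byte_chi2(bytes(8192)))   # 8192 zero bytes
```

If this reading is right, the dominant sec_chi2 values belong to sections filled with a single repeated byte, which would also fit the large share of zero-entropy sections.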

sec_entropy
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 102
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6538920889
Minimum: 0
Maximum: 7.84
Zeros: 37863
Zeros (%): 85.2%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 6.12
Maximum: 7.84
Range: 7.84
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.890411814
Coefficient of variation (CV): 2.891014964
Kurtosis: 6.921051665
Mean: 0.6538920889
Median Absolute Deviation (MAD): 0
Skewness: 2.892108035
Sum: 29061.58
Variance: 3.573656827
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    37863   85.2%
6.12                 917     2.1%
7.38                 917     2.1%
0.48                 917     2.1%
7.84                 917     2.1%
1.04                 917     2.1%
5.33                 917     2.1%
2.97                 565     1.3%
2.98                 346     0.8%
0.96                 7       < 0.1%
Other values (92)    161     0.4%

Value                Count   Frequency (%)
0                    37863   85.2%
0.23                 1       < 0.1%
0.32                 1       < 0.1%
0.33                 2       < 0.1%
0.48                 917     2.1%
0.52                 1       < 0.1%
0.53                 1       < 0.1%
0.63                 2       < 0.1%
0.64                 1       < 0.1%
0.86                 6       < 0.1%

Value                Count   Frequency (%)
7.84                 917     2.1%
7.38                 917     2.1%
6.12                 917     2.1%
5.69                 5       < 0.1%
5.66                 2       < 0.1%
5.43                 1       < 0.1%
5.4                  1       < 0.1%
5.33                 917     2.1%
5.29                 1       < 0.1%
5.28                 3       < 0.1%
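
Assuming sec_entropy is the Shannon entropy of a section's bytes in bits per byte (the observed range of 0 to 7.84 fits the theoretical 0 to 8 range, but the report does not say so), a value of 0 means every byte in the section is the same. A minimal sketch of that definition, with a hypothetical function name:

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for empty or constant data, at most 8.0."""
    if not data:
        return 0.0
    n = len(data)
    # Sum of p * log2(1/p) over the byte values that actually occur.
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())

print(byte_entropy(bytes(4096)))         # constant section
print(byte_entropy(bytes(range(256))))   # every byte value exactly once
```

Values near 8 usually indicate compressed or encrypted content, so under this assumption the handful of sections at 7.38 and 7.84 stand out against the 85.2% of rows at zero.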

sec_md5
Categorical

HIGH CARDINALITY

Distinct: 176
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 347.3 KiB

Value                               Count
620f0b67a91f7f74151bc5be745b7110    26991
0829f71740aab1ab98b33eae21dee122    7405
cf845a781c107ec1346e849c9dd1b7e8    3106
a43aef6b6f939e7959b81ec2e8806ecb    917
41432e60924ed4a91548fe43265fa3d1    917
Other values (171)                  5108

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 1422208
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 160
Unique (%): 0.4%

Sample

1st row: a43aef6b6f939e7959b81ec2e8806ecb
2nd row: 8d17b27fb2e22ce52dc7953821236946
3rd row: 41432e60924ed4a91548fe43265fa3d1
4th row: ee4f14b17ad059c6b83e4503a65b81f5
5th row: c694e1149d43f57e80047a0bc754f9d3

Common Values

Value                               Count   Frequency (%)
620f0b67a91f7f74151bc5be745b7110    26991   60.7%
0829f71740aab1ab98b33eae21dee122    7405    16.7%
cf845a781c107ec1346e849c9dd1b7e8    3106    7.0%
a43aef6b6f939e7959b81ec2e8806ecb    917     2.1%
41432e60924ed4a91548fe43265fa3d1    917     2.1%
ee4f14b17ad059c6b83e4503a65b81f5    917     2.1%
c694e1149d43f57e80047a0bc754f9d3    917     2.1%
a12e0f6f3ca72d1748b5dbba51b45e19    917     2.1%
cf4f5746abe0554542e6173fc6438b6c    917     2.1%
8d17b27fb2e22ce52dc7953821236946    565     1.3%
Other values (166)                  875     2.0%

Length

Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
1191689
13.5%
7171519
12.1%
b147761
10.4%
0109599
7.7%
f105021
 
7.4%
5102310
 
7.2%
494223
 
6.6%
e84046
 
5.9%
a72273
 
5.1%
671850
 
5.1%
Other values (6)271917
19.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number940017
66.1%
Lowercase Letter482191
33.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1191689
20.4%
7171519
18.2%
0109599
11.7%
5102310
10.9%
494223
10.0%
671850
 
7.6%
268366
 
7.3%
961006
 
6.5%
838008
 
4.0%
331447
 
3.3%
Lowercase Letter
ValueCountFrequency (%)
b147761
30.6%
f105021
21.8%
e84046
17.4%
a72273
15.0%
c50189
 
10.4%
d22901
 
4.7%

Most occurring scripts

ValueCountFrequency (%)
Common940017
66.1%
Latin482191
33.9%

Most frequent character per script

Common
ValueCountFrequency (%)
1191689
20.4%
7171519
18.2%
0109599
11.7%
5102310
10.9%
494223
10.0%
671850
 
7.6%
268366
 
7.3%
961006
 
6.5%
838008
 
4.0%
331447
 
3.3%
Latin
ValueCountFrequency (%)
b147761
30.6%
f105021
21.8%
e84046
17.4%
a72273
15.0%
c50189
 
10.4%
d22901
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII1422208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1191689
13.5%
7171519
12.1%
b147761
10.4%
0109599
7.7%
f105021
 
7.4%
5102310
 
7.2%
494223
 
6.6%
e84046
 
5.9%
a72273
 
5.1%
671850
 
5.1%
Other values (6)271917
19.1%

raw_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24557.47565
Minimum: 4096
Maximum: 507904
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 4096
5-th percentile: 4096
Q1: 4096
Median: 4096
Q3: 8192
95-th percentile: 32768
Maximum: 507904
Range: 503808
Interquartile range (IQR): 4096

Descriptive statistics

Standard deviation: 80934.54156
Coefficient of variation (CV): 3.2957191
Kurtosis: 24.94119374
Mean: 24557.47565
Median Absolute Deviation (MAD): 0
Skewness: 4.978015168
Sum: 1091432448
Variance: 6550400017
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value     Count   Frequency (%)
4096      28940   65.1%
8192      9275    20.9%
28672     3110    7.0%
32768     917     2.1%
507904    917     2.1%
225280    917     2.1%
286720    366     0.8%
212992    2       < 0.1%

Value     Count   Frequency (%)
4096      28940   65.1%
8192      9275    20.9%
28672     3110    7.0%
32768     917     2.1%
212992    2       < 0.1%
225280    917     2.1%
286720    366     0.8%
507904    917     2.1%

Value     Count   Frequency (%)
507904    917     2.1%
286720    366     0.8%
225280    917     2.1%
212992    2       < 0.1%
32768     917     2.1%
28672     3110    7.0%
8192      9275    20.9%
4096      28940   65.1%
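
Because raw_size takes only eight distinct values, the frequency table fully determines the summary statistics. The sketch below recomputes the observation count, sum and mean from the (value, count) pairs and checks that every value is a multiple of 4096 bytes; the variable names are illustrative.

```python
# (value, count) pairs copied from the raw_size frequency table.
raw_size_counts = {
    4096: 28940, 8192: 9275, 28672: 3110, 32768: 917,
    212992: 2, 225280: 917, 286720: 366, 507904: 917,
}

n = sum(raw_size_counts.values())                                  # observations
total = sum(value * count for value, count in raw_size_counts.items())  # Sum
mean = total / n                                                   # Mean
page_aligned = all(value % 4096 == 0 for value in raw_size_counts)

print(n, total, round(mean, 5), page_aligned)
```

The recomputed count and sum match the reported 44444 observations and Sum of 1091432448, which also confirms how the fused value and count digits in the table split apart.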

virtual_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 74
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22068.3459
Minimum: 132
Maximum: 504779
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 132
5-th percentile: 336
Q1: 877
Median: 2302
Q3: 4751
95-th percentile: 32686
Maximum: 504779
Range: 504647
Interquartile range (IQR): 3874

Descriptive statistics

Standard deviation: 80911.62619
Coefficient of variation (CV): 3.666411002
Kurtosis: 24.82677084
Mean: 22068.3459
Median Absolute Deviation (MAD): 1731
Skewness: 4.963146834
Sum: 980805565
Variance: 6546691253
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
571                  4207    9.5%
27856                3110    7.0%
1846                 3067    6.9%
2302                 2537    5.7%
4751                 2151    4.8%
4388                 2040    4.6%
3038                 1943    4.4%
431                  1800    4.1%
954                  1684    3.8%
4728                 1565    3.5%
Other values (64)    20340   45.8%

Value                Count   Frequency (%)
132                  421     0.9%
180                  231     0.5%
249                  64      0.1%
262                  17      < 0.1%
268                  48      0.1%
311                  826     1.9%
318                  597     1.3%
336                  917     2.1%
364                  5       < 0.1%
431                  1800    4.1%

Value                Count   Frequency (%)
504779               917     2.1%
282996               366     0.8%
223348               917     2.1%
210737               2       < 0.1%
32686                917     2.1%
27856                3110    7.0%
8150                 1       < 0.1%
8130                 96      0.2%
8104                 684     1.5%
8097                 18      < 0.1%

virtual_address
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 166
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 917060.5216
Minimum: 4096
Maximum: 1847296
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 347.3 KiB

Quantile statistics

Minimum: 4096
5-th percentile: 45056
Q1: 819200
Median: 962560
Q3: 1073152
95-th percentile: 1277952
Maximum: 1847296
Range: 1843200
Interquartile range (IQR): 253952

Descriptive statistics

Standard deviation: 283631.6676
Coefficient of variation (CV): 0.3092834779
Kurtosis: 3.461868248
Mean: 917060.5216
Median Absolute Deviation (MAD): 126976
Skewness: -1.691877275
Sum: 4.075783782 × 10^10
Variance: 8.044692286 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
4096                  917     2.1%
782336                917     2.1%
798720                917     2.1%
794624                917     2.1%
786432                917     2.1%
36864                 917     2.1%
557056                917     2.1%
552960                917     2.1%
45056                 917     2.1%
802816                898     2.0%
Other values (156)    35293   79.4%

Value                 Count   Frequency (%)
4096                  917     2.1%
36864                 917     2.1%
45056                 917     2.1%
552960                917     2.1%
557056                917     2.1%
782336                917     2.1%
786432                917     2.1%
794624                917     2.1%
798720                917     2.1%
802816                898     2.0%

Value                 Count   Frequency (%)
1847296               1       < 0.1%
1630208               1       < 0.1%
1613824               1       < 0.1%
1609728               1       < 0.1%
1605632               1       < 0.1%
1601536               3       < 0.1%
1597440               2       < 0.1%
1589248               1       < 0.1%
1568768               1       < 0.1%
1560576               2       < 0.1%

sec_name
Categorical

HIGH CARDINALITY

Distinct: 1495
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 347.3 KiB

Value                  Count
.text                  917
.data                  917
.pdata                 917
.EXP                   917
.rsrc                  917
Other values (1490)    39859

Length

Max length: 7
Median length: 6
Mean length: 5.500607506
Min length: 4

Characters and Unicode

Total characters: 244469
Distinct characters: 30
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 559
Unique (%): 1.3%

Sample

1st row: .text
2nd row: .rdata
3rd row: .data
4th row: .pdata
5th row: .EXP

Common Values

Value                  Count   Frequency (%)
.text                  917     2.1%
.data                  917     2.1%
.pdata                 917     2.1%
.EXP                   917     2.1%
.rsrc                  917     2.1%
.reloc                 917     2.1%
.tbrtao                917     2.1%
.rdata                 917     2.1%
.jubcj                 548     1.2%
.lrvm                  499     1.1%
Other values (1485)    36061   81.1%

Length

Histogram of lengths of the category
Value                  Count   Frequency (%)
text                   917     2.1%
reloc                  917     2.1%
rdata                  917     2.1%
tbrtao                 917     2.1%
data                   917     2.1%
rsrc                   917     2.1%
pdata                  917     2.1%
exp                    917     2.1%
jubcj                  548     1.2%
lrvm                   499     1.1%
Other values (1485)    36061   81.1%

Most occurring characters

ValueCountFrequency (%)
.44444
18.2%
r14056
 
5.7%
t12595
 
5.2%
a12308
 
5.0%
d9597
 
3.9%
c9291
 
3.8%
e9245
 
3.8%
x8819
 
3.6%
k8584
 
3.5%
p8353
 
3.4%
Other values (20)107177
43.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter197274
80.7%
Other Punctuation44444
 
18.2%
Uppercase Letter2751
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r14056
 
7.1%
t12595
 
6.4%
a12308
 
6.2%
d9597
 
4.9%
c9291
 
4.7%
e9245
 
4.7%
x8819
 
4.5%
k8584
 
4.4%
p8353
 
4.2%
f7814
 
4.0%
Other values (16)96612
49.0%
Uppercase Letter
ValueCountFrequency (%)
P917
33.3%
X917
33.3%
E917
33.3%
Other Punctuation
ValueCountFrequency (%)
.44444
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin200025
81.8%
Common44444
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r14056
 
7.0%
t12595
 
6.3%
a12308
 
6.2%
d9597
 
4.8%
c9291
 
4.6%
e9245
 
4.6%
x8819
 
4.4%
k8584
 
4.3%
p8353
 
4.2%
f7814
 
3.9%
Other values (19)99363
49.7%
Common
ValueCountFrequency (%)
.44444
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII244469
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.44444
18.2%
r14056
 
5.7%
t12595
 
5.2%
a12308
 
5.0%
d9597
 
3.9%
c9291
 
3.8%
e9245
 
3.8%
x8819
 
3.6%
k8584
 
3.5%
p8353
 
3.4%
Other values (20)107177
43.8%

Interactions

[Figure: 64 pairwise interaction plots, rendered with Matplotlib v3.5.1; images not preserved in this text version.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
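
The calculation described above can be sketched in plain Python. This is an illustrative version only, not the profiling tool's code: the function names are assumptions, and ties are handled by average ranks, which is one common convention.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]        # monotonic but nonlinear in x
rho = spearman(x, y)           # ranks agree exactly
r = pearson(x, y)              # strictly below 1 because y is not linear in x
```

Because `y` is a monotonic but nonlinear function of `x`, the ranks agree perfectly while the raw values do not lie on a line, which is the situation where ρ and r diverge.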

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
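
As a hedged illustration of this formula and of the invariance property mentioned above, the sketch below computes r from population covariance and standard deviations. The data values and the name `pearson_r` are made up for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shifting and rescaling either variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor flips the sign but not the magnitude.
r_flipped = pearson_r(x, [-b for b in y])
```

The same value results whether the population (divide by n) or sample (divide by n - 1) convention is used, as long as it is applied consistently to the covariance and both standard deviations.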

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
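
The pair-counting definition above corresponds to what is usually called tau-a. A small sketch under that assumption follows; the function name is illustrative. Tied pairs count as neither concordant nor discordant here, whereas libraries such as SciPy default to the tie-adjusted tau-b, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:        # both variables order the pair the same way
            concordant += 1
        elif product < 0:      # the variables order the pair in opposite ways
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]               # one swapped pair out of six
tau = kendall_tau_a(x, y)      # (5 - 1) / 6
```

The quadratic loop over all pairs is fine for illustration but would be slow for the 44444 observations in this dataset; production implementations use an O(n log n) algorithm.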

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
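
To make the two nullity views concrete, here is a minimal standard-library sketch of the underlying computation. The records are hypothetical and contain deliberately injected `None` values; the dataset profiled in this report has no missing cells.

```python
# Hypothetical records; None marks a missing cell.
rows = [
    {"sec_name": ".text", "sec_entropy": 6.12, "raw_size": 32768},
    {"sec_name": ".rdata", "sec_entropy": None, "raw_size": 8192},
    {"sec_name": None, "sec_entropy": 7.38, "raw_size": 507904},
]
columns = ["sec_name", "sec_entropy", "raw_size"]

# Nullity matrix: one row per record, True where the value is present.
nullity_matrix = [[row[col] is not None for col in columns] for row in rows]

# Per-column count of present values (what the by-column bar chart displays).
present_per_column = {
    col: sum(present[i] for present in nullity_matrix)
    for i, col in enumerate(columns)
}

# Total number of missing cells across the whole table.
missing_cells = sum(not cell for record in nullity_matrix for cell in record)
```

Plotting `nullity_matrix` as a black-and-white grid gives the matrix view, and plotting `present_per_column` as bars gives the by-column view.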

Sample

First rows

   df_index  Unnamed: 0  filename+win_count                    sha256                                                            imp_hash                          sec_chi2    sec_entropy  sec_md5                           raw_size  virtual_size  virtual_address  sec_name
0  5245      5245        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  358466.16   6.12         a43aef6b6f939e7959b81ec2e8806ecb  32768     32686         4096             .text
1  5246      5246        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  951366.50   2.97         8d17b27fb2e22ce52dc7953821236946  8192      7802          36864            .rdata
2  5247      5247        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  2262192.25  7.38         41432e60924ed4a91548fe43265fa3d1  507904    504779        45056            .data
3  5248      5248        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  962731.13   0.48         ee4f14b17ad059c6b83e4503a65b81f5  4096      336           552960           .pdata
4  5249      5249        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  66544.30    7.84         c694e1149d43f57e80047a0bc754f9d3  225280    223348        557056           .EXP
5  5250      5250        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  842616.38   1.04         a12e0f6f3ca72d1748b5dbba51b45e19  4096      976           782336           .rsrc
6  5251      5251        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  57874.69    5.33         cf4f5746abe0554542e6173fc6438b6c  8192      8061          786432           .reloc
7  5252      5252        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  1044480.00  0.00         620f0b67a91f7f74151bc5be745b7110  4096      2509          794624           .tbrtao
8  5253      5253        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  1044480.00  0.00         620f0b67a91f7f74151bc5be745b7110  4096      571           798720           .jubcj
9  5254      5254        20220329/2022032900/2022032900_10991  3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9  fbcff5951ad0c204f4744c629548c6c6  1044480.00  0.00         620f0b67a91f7f74151bc5be745b7110  4096      2302          802816           .ipo

Last rows

       df_index  Unnamed: 0  filename+win_count              sha256                                                            imp_hash                          sec_chi2   sec_entropy  sec_md5                           raw_size  virtual_size  virtual_address  sec_name
44434  5555039   5555039     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      311           1077248          .gzlcc
44435  5555040   5555040     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      431           1081344          .ubgnm
44436  5555041   5555041     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      431           1085440          .pdvbd
44437  5555042   5555042     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      2302          1089536          .nkh
44438  5555043   5555043     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      905           1093632          .cftn
44439  5555044   5555044     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      877           1097728          .fmkp
44440  5555045   5555045     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  2088960.0  0.0          0829f71740aab1ab98b33eae21dee122  8192      4388          1101824          .zqdwjh
44441  5555046   5555046     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  2088960.0  0.0          0829f71740aab1ab98b33eae21dee122  8192      7782          1110016          .chon
44442  5555047   5555047     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  2088960.0  0.0          0829f71740aab1ab98b33eae21dee122  8192      4751          1118208          .nin
44443  5555048   5555048     2022042101/2022042101_20591567  0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b  fbcff5951ad0c204f4744c629548c6c6  1044480.0  0.0          620f0b67a91f7f74151bc5be745b7110  4096      730           1126400          .rulqdd